{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# BigQuery command-line tool\n",
"\n",
"The BigQuery command-line tool is installed as part of the [Cloud SDK](https://cloud-dot-devsite.googleplex.com/sdk/docs/) and can be used to interact with BigQuery. When you use CLI commands in a notebook, the command must be prepended with a `!`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## View available commands\n",
"\n",
"To view the available commands for the BigQuery command-line tool, use the `help` command."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq help"
]
},
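{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see the flags for a single command, you can pass the command name to `help`. For example, the following cell shows the help text for the `mk` command used later in this tutorial."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq help mk"
]
},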
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a new dataset\n",
"\n",
"A dataset is contained within a specific [project](https://cloud.google.com/bigquery/docs/projects). Datasets are top-level containers that are used to organize and control access to your [tables](https://cloud.google.com/bigquery/docs/tables) and [views](https://cloud.google.com/bigquery/docs/views). A table or view must belong to a dataset. You need to create at least one dataset before [loading data into BigQuery](https://cloud.google.com/bigquery/loading-data-into-bigquery).\n",
"\n",
"First, name your new dataset:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dataset_id = \"your_new_dataset\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following command creates a new dataset in the US using the ID defined above.\n",
"\n",
"NOTE: In the examples in this notebook, the `dataset_id` variable is referenced in the commands using both `{}` and `$`. To avoid creating and using variables, replace these interpolated variables with literal values and remove the `{}` and `$` characters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq --location=US mk --dataset $dataset_id"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The response should look like the following:\n",
"\n",
"```\n",
"Dataset 'your-project-id:your_new_dataset' successfully created.\n",
"```"
]
},
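{
"cell_type": "markdown",
"metadata": {},
"source": [
"To confirm that the dataset was created, you can inspect it with the `show` command. The following cell is a minimal example that prints the dataset's metadata as JSON."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq show --format=prettyjson $dataset_id"
]
},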
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## List datasets\n",
"\n",
"The following command lists all datasets in your default project."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq ls"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The response should look like the following:\n",
"\n",
"```\n",
" datasetId \n",
" ------------------------------ \n",
" your_new_dataset \n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load data from a local file to a table\n",
"\n",
"The following example demonstrates how to load a local CSV file into a new or existing table. See [SourceFormat](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.SourceFormat.html#google.cloud.bigquery.job.SourceFormat) in the Python client library documentation for a list of available source formats. For more information, see [Loading Data into BigQuery from a local data source](https://cloud.google.com/bigquery/docs/loading-data-local) in the BigQuery documentation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq \\\n",
" --location=US \\\n",
" load \\\n",
" --autodetect \\\n",
" --skip_leading_rows=1 \\\n",
" --source_format=CSV \\\n",
" {dataset_id}.us_states_local_file \\\n",
" 'resources/us-states.csv'"
]
},
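{
"cell_type": "markdown",
"metadata": {},
"source": [
"To verify the load, you can preview a few rows of the new table with the `head` command. The following cell assumes the load job above completed successfully."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq head --max_rows=5 {dataset_id}.us_states_local_file"
]
},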
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load data from Cloud Storage to a table\n",
"\n",
"The following example demonstrates how to load a local CSV file into a new table. See [SourceFormat](https://googleapis.github.io/google-cloud-python/latest/bigquery/generated/google.cloud.bigquery.job.SourceFormat.html#google.cloud.bigquery.job.SourceFormat) in the Python client library documentation for a list of available source formats. For more information, see [Introduction to loading data from Cloud Storage](https://cloud.google.com/bigquery/docs/loading-data-cloud-storage) in the BigQuery documentation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq \\\n",
" --location=US \\\n",
" load \\\n",
" --autodetect \\\n",
" --skip_leading_rows=1 \\\n",
" --source_format=CSV \\\n",
" {dataset_id}.us_states_gcs \\\n",
" 'gs://cloud-samples-data/bigquery/us-states/us-states.csv'"
]
},
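{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the load used `--autodetect`, you may want to check the schema that BigQuery inferred. The following cell prints the schema of the new table, assuming the load job above completed successfully."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq show --schema --format=prettyjson {dataset_id}.us_states_gcs"
]
},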
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Run a query\n",
"\n",
"The BigQuery command-line tool has a `query` command for running queries, but it is recommended to use the [magic command](./BigQuery%20Query%20Magic.ipynb) for this purpose."
]
},
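{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you prefer the command-line tool, the following cell is a minimal example that runs a standard SQL query against the `us_states_gcs` table created above. It assumes that load step succeeded."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq query --use_legacy_sql=false \\\n",
"    'SELECT * FROM {dataset_id}.us_states_gcs LIMIT 5'"
]
},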
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cleaning Up\n",
"\n",
"The following code deletes the dataset created for this tutorial, including all tables in the dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!bq rm -r -f --dataset $dataset_id"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}